A three-dimensional model of the carbohydrate recognition
domain of a rat macrophage C-type lectin has been con-
structed by comparative modeling and assessed by inverse
folding analysis. Comparative modeling in the presence of
low sequence similarity was based on information provided
by comparison of X-ray structures and sequence-structure
alignments. The sequence-structure compatibility of the
model was sound. Its binding site was analyzed in compari-
son to the X-ray structure of a galactose-specific mutant of
the mannose-binding protein. The specificity of the macro-
phage lectin was discussed in light of mutagenesis data on
asialoglycoprotein receptors. © 1996 by Elsevier Science
Inc.

Keywords: protein structure prediction, sequence similar- 
ity, structural similarity, comparative modeling, model as- 
sessment, protein superfamily, computer graphics analysis, 
macrophage lectin, carbohydrate binding 



INTRODUCTION 

Members of protein superfamilies are thought to adopt simi-
lar global folds, despite sharing only limited sequence simi-
larity. Protein superfamilies appear attractive as targets for
comparative modeling, if the structure of at least one mem-
ber has been determined. It may then be possible to generate
approximate three-dimensional models for other members
of the family. However, the low level of sequence similarity
shared by protein superfamily members, often 30% or less,
makes it difficult to generate topologically meaningful
alignments relative to structural template(s), and hence the
accuracy of such models may be limited and insufficient for
more detailed analysis. This is particularly problematic in
cases where information from structure comparison is not
(yet) available to complement sequence alignments.

Calcium-dependent (C-type) lectins form a protein super-
family that includes a variety of mammalian carbohydrate-
binding proteins. C-type lectin domains specifically bind
mono- or oligosaccharides in a calcium-dependent fashion
and function as carbohydrate recognition modules. The C-
type lectin domain of the rat mannose-binding protein
(MBP) was the first structure of a C-type lectin domain
determined and revealed a previously unobserved protein
fold, consisting to ~50% of loops and other extended re-
gions of unusual secondary structure. Structures of trimeric
homologs of rat MBP and of a mutant form of rat MBP,
which is primarily specific for galactose (Gal), have also
been determined. In addition to these closely related mol-
ecules, the structure of the C-type lectin domain of E-
selectin, a cell adhesion molecule, became available, and
the MBP and E-selectin structures have been compared in
detail. Thus, information from structure comparison is
available to improve the accuracy of multiple sequence
alignments of C-type lectins and hence the ability to con-
struct three-dimensional C-type lectin models by compara-
tive model building.

In this study, an attempt was made to build an accurate model
of the extracellular carbohydrate recognition domain of the
rat macrophage lectin (ML). Macrophage lectin is a type
II transmembrane glycoprotein receptor and includes a car-
boxy-terminal C-type lectin domain with specificity for Gal
and N-acetylgalactosamine (GalNAc). Macrophage lectin
is implicated in carbohydrate-dependent endocytosis by
macrophages and is closely related to the rat hepatic lectin/
asialoglycoprotein receptor (RHL), which also binds Gal,
but preferentially GalNAc. The ML model was generated
on the basis of E-selectin/MBP structure comparison and
using MBP as structural template. The sequence-structure
fitness of the finalized model was assessed by energy profile
analysis. The carbohydrate-binding site of the ML model
was analyzed in comparison to a galactose-specific mutant
of the mannose-binding protein and considering mutagen-
esis data.

METHODS 

Topological sequence comparison 

X-ray structures of E-selectin at 2.0-Å resolution and of
MBP at 1.7-Å resolution were compared by backbone
superposition as described previously. Briefly, structurally conserved
regions were identified by sequential least-squares superpo-
sition of backbone segments of increasing length followed
by root mean square deviation (rmsd) comparison. Back-
bone segments that superimposed with an rmsd of less than
1 Å were determined, and a structure-based sequence align-
ment was generated. This structure-oriented alignment was
complemented by sequence comparison with ML and pro-
vided the basis for comparative prediction of its structure.
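
The segment-wise superposition described above can be illustrated with a
short Python sketch. It is not the program used in this study. It assumes
that the two backbones are supplied as equal-length lists of already paired
(x, y, z) coordinates, obtains the least-squares rmsd from Horn's
quaternion formulation with a small Jacobi eigenvalue routine, and grows
each segment greedily. The function names, the minimum segment length of
four residues, and the greedy extension rule are assumptions made for
illustration only.

```python
import math

def _centered(points):
    """Shift a list of (x, y, z) points so that their centroid is the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in points]

def _jacobi_eigenvalues(m, sweeps=50):
    """Eigenvalues of a small symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in m]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] * a[i][j] for i in range(n) for j in range(i + 1, n))
        if off == 0.0:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.hypot(t, 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)]

def superposed_rmsd(mobile, target):
    """Minimum rmsd after optimal rigid-body superposition (Horn's method)."""
    p, q = _centered(mobile), _centered(target)
    s = [[sum(a[i] * b[j] for a, b in zip(p, q)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    n_mat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    lam = max(_jacobi_eigenvalues(n_mat))
    g = sum(x * x for pt in p + q for x in pt)
    return math.sqrt(max(0.0, (g - 2.0 * lam) / len(p)))

def conserved_segments(a, b, cutoff=1.0, min_len=4):
    """Greedily grow paired segments while their superposed rmsd stays below cutoff.

    Returns half-open (start, end) index pairs. min_len is an assumed parameter.
    """
    n = min(len(a), len(b))
    segments, start = [], 0
    while start + min_len <= n:
        end = start + min_len
        if superposed_rmsd(a[start:end], b[start:end]) >= cutoff:
            start += 1
            continue
        while end < n and superposed_rmsd(a[start:end + 1], b[start:end + 1]) < cutoff:
            end += 1
        segments.append((start, end))
        start = end
    return segments
```

With real structures, the inputs would be the backbone atom coordinates of
residue pairs taken from the two X-ray structures, and the resulting index
ranges would mark candidate structurally conserved regions.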

Comparative model building and loop modeling 

Structural manipulations and computer graphics analysis
were carried out using InsightII (MSI, San Diego, CA).
Backbone regions structurally conserved in MBP and E-
selectin provided the framework for the ML model and were
copied from MBP. With the exception of segment 257-265
(see below), the conformations of regions including inser-
tions and deletions and other structurally variable (loop)
regions in MBP and E-selectin were approximated by con-
formational search with CONGEN. In these calculations,
main-chain conformational space was uniformly searched in
30° torsion angle increments, and side-chain conformations
were modeled using an iterative search procedure. Con-
formations with acceptable interatom interactions and po-
tential energy were sampled and, for each modeled loop, the
conformation with lowest solvent-accessible surface within
2 kcal of the energy minimum conformation was selected
and included in the model. Segment 257-265 was modeled
using the corresponding loop conformation in the X-ray
structure of a galactose-binding MBP mutant (gbMBP).
This loop was included in the model using Insight's Loop
Splice routine. Side-chain replacements were modeled via
computer graphics in conformations as similar as possible to
the original conformation or, for structurally unconstrained
positions, using a rotamer search procedure implemented
in InsightII. In these rotamer search calculations, an 8-Å
cutoff distance was used for nonbonded interactions. Color
figures were generated using InsightII (version 95.0) on an
SGI Indigo Impact and processed as RGB files.
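
The loop selection rule described above (uniform 30° torsion sampling, then
choosing the lowest solvent-accessible surface among conformations within
2 kcal of the energy minimum) can be sketched as follows. This is a
minimal illustration and not CONGEN itself: the Conformation record, its
field names, and the helper functions are hypothetical, and the energy and
surface values are assumed to come from an external program.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Conformation:
    torsions: tuple   # main-chain torsion angles in degrees
    energy: float     # potential energy (kcal, as quoted in the text)
    sasa: float       # solvent-accessible surface (hypothetical field, in Å^2)

def torsion_grid(n_torsions, step=30):
    """Return every combination of torsion angles on a uniform grid.

    A 30-degree step gives 12 values per torsion (-180, -150, ..., 150).
    """
    angles = range(-180, 180, step)
    return product(angles, repeat=n_torsions)

def select_loop(conformations, window=2.0):
    """Pick the lowest-surface conformation within `window` of the energy minimum."""
    e_min = min(c.energy for c in conformations)
    near_minimum = [c for c in conformations if c.energy - e_min <= window]
    return min(near_minimum, key=lambda c: c.sasa)
```

The grid grows as 12 to the power of the number of torsions, which is why
such searches are restricted to short loop segments and combined with
filters on interatomic contacts and energy.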

Model refinement and assessment 

The initially assembled model was refined by energy mini-
mization with Discover (MSI, San Diego, CA) until the rms
derivative of the energy function was ~1 kcal/Å. In these
calculations, a distance-dependent dielectric constant (1r)
and a 15-Å cutoff distance for the treatment of nonbonded
interactions were used. During the energy minimization cal-
culations, residues predicted to participate in the formation
of calcium-binding sites were constrained to their original
positions. The stereochemical quality of the refined model
was confirmed and its sequence-structure compatibility was
assessed by energy profile analysis using PROSA II (version
3.0). Pairwise residue interaction energies were calcu-
lated using β-carbon interactions and, for graphical repre-
sentation and analysis of the profiles, a 50-residue window
was used for energy averaging at each residue position.
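
The window averaging used for the energy profiles can be sketched in a few
lines of Python. This is an illustration only and not the PROSA II
implementation: the function name is hypothetical, and both the placement
of the window around each residue and its truncation at the chain ends are
assumptions, since the text does not specify them.

```python
def windowed_profile(energies, window=50):
    """Average per-residue energies over a window placed around each residue.

    The window covers `window` residues where the chain allows it and is
    truncated at the chain termini (an assumed convention).
    """
    n = len(energies)
    profile = []
    for i in range(n):
        lo = max(0, i - window // 2)
        hi = min(n, i - window // 2 + window)
        profile.append(sum(energies[lo:hi]) / (hi - lo))
    return profile
```

A profile that stays negative at every position, for example
`all(e < 0 for e in windowed_profile(energies))`, corresponds to the
criterion applied to the model in Results and Discussion.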

Binding site modeling 

The position of the N-acetylgalactosamine (GalNAc) ligand
in the ML binding site was modeled by superposition of
structurally conserved regions of the gbMBP-GalNAc
structure and the ML model, followed by transfer of the
ligand in its crystallographic conformation. Mutations in the
binding site region were modeled using Insight's Replace
command, followed by a search for the lowest energy side-
chain rotamer conformation.

RESULTS AND DISCUSSION 
Structure-oriented sequence comparison 

The sequences of the carbohydrate recognition domains of
ML, MBP, and E-selectin are less than 30% identical. Com-
parison of the MBP and E-selectin X-ray structures reveals
significant differences in some regions, although both pro-
teins adopt the same fold. This information was incorpo-
rated in a sequence alignment of MBP and E-selectin, which
thus reflects the spatial equivalence of residues (Figure 1).
The sequence of ML was aligned against this template by
matching core residues, structurally constrained positions,
and consensus residues in structurally conserved regions.
Using these criteria, regions structurally conserved in MBP,
E-selectin, and ML could be assigned with confidence.
Likewise, it was possible to identify the regions including
insertions and deletions, all of which mapped to structurally
variable regions (Figure 1). A schematic representation of
the C-type lectin fold is shown in Color Plate 1, which
highlights the secondary structure elements and
extended regions of unusual secondary structure.
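
For reference, pairwise sequence identity over an alignment such as the one
in Figure 1 can be computed as in the following sketch. The choice of
denominator (positions where neither sequence has a gap) is an assumption,
as several conventions exist and the text does not state which was used.
The function name and the short sequences in the usage note are
hypothetical and are not taken from the lectin alignment.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over aligned positions without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)
```

For the toy alignment `percent_identity("ACDE-FG", "ACQEWF-")`, five
positions are aligned without gaps and four of them match.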

Model building and assessment 

Backbone segments outside structurally conserved regions
were considered variable, and their conformations were ap-
proximated using the loop modeling techniques described
in Methods. The sequence-structure compatibility of the
completed ML model was assessed by energy profile analy-
sis, a structure assessment technique belonging to the in-
verse folding approach. These methods do not detect in-
correctly modeled loop or side-chain conformations, which
limit prediction accuracy. However, inverse folding tech-
niques have made it possible to analyze the sequence-
structure compatibility of a model and thus to assess its
overall reliability. As shown in Figure 2, the average residue
interaction energy is negative at each position in the model.
This profile is consistent with an overall correctly folded
structure with no substantial errors in core regions. Thus,
the energy profile analysis carried out here indicated that the
ML model was sound and sufficiently accurate to analyze
some details.

In a previous report, model building of E-selectin and
comparison with its later determined X-ray structure were
described. The approach was similar to the one presented
herein in that the selectin model was built on the basis of the
MBP X-ray structure and structure-oriented sequence com-
parison. The model was shown to be in good agreement
with its X-ray structure, and good agreement was observed
in the carbohydrate-binding region. The major difference
between the previous and present study is that no informa-
tion from C-type lectin structure comparison was available
at the time of selectin modeling. This information was used
to aid in the model building of ML and is thought to further
increase modeling accuracy.

The macrophage lectin molecular model and its 
carbohydrate-binding site 

Color Plate 2 shows the ML model superposed on the MBP
and E-selectin X-ray structures. The major differences in
these structures occur in extended loop regions. The func-
tional calcium-binding site, which is shared by these pro-
teins, is directly involved in carbohydrate binding to
MBP. Compared to MBP, the functional calcium coordi-
nation sphere in ML contains two residue replacements
(ML/MBP: Q253/E and D255/N) that favor the coordina-
tion of galactose over mannose. Residues that participate
in the formation of the functional calcium-binding site in
ML and residues that surround this site are shown in Color
Plate 3. The representation outlines the predicted carbohy-
drate-binding site. Residue W257, part of the glycine-rich
loop 257-265, is critical for Gal binding. This region in
ML displays a five-residue insertion relative to MBP and
E-selectin and correctly positions W257 relative to the cal-
cium coordination sphere. To study the ML binding site in
more detail, the position of the GalNAc ligand was modeled
on the basis of its orientation and conformation observed in
the gbMBP-GalNAc complex.

Figure 2. Energy profile of the ML model. Pairwise average
residue interaction energy is given in units of E/kT (E, in-
teraction energy [in kcal/mol]; k, Boltzmann constant; T,
temperature in degrees Kelvin) and plotted against residue
positions. The interaction energy was calculated at each
residue position, using a 50-residue window for energy av-
eraging.

No unfavorable contacts were detected in the model after
transfer of the GalNAc ligand from the gbMBP-GalNAc
X-ray structure (see Methods). The pentagonal-bipyramidal
geometry of the calcium-binding site is conserved. The Gal
O-3 and O-4 participate in the formation of the calcium-
binding site and share an apical position. The contacts
within the calcium coordination sphere in the gbMBP-
GalNAc X-ray structure and the ML-GalNAc model are
similar. In the model, the O-3-calcium and the O-4-calcium
distances are both 2.50 Å, whereas the corresponding crys-
tallographic contact distances are 2.52 and 2.56 Å, respec-
tively. In the X-ray structure, the calcium-ligand distances
range from 2.27 to 2.56 Å (average distance, 2.46 Å), and in
the model the corresponding distances range from 2.30 to
2.58 Å (average, 2.45 Å). Interactions between the Gal moi-
ety, the calcium coordination sphere, and residue W257
anchored the ligand in the calcium-binding site and deter-
mined its orientation. Only the N-acetyl position was
slightly adjusted. Color Plate 4 shows a detailed view of this
model. However, the modeled complex can be considered
only as a first approximation and must be interpreted with
caution.

Structure-function aspects 

On the basis of the model, binding of the GalNAc N-acetyl
group is thought to involve a subsite in ML formed by
residues H270, R284, and Y286, which are, on average, at
a distance of ~8 Å from the functional calcium. The location
of this region approximately corresponds to the carbohy-
drate-binding site in E- and P-selectin. Mutagenesis
studies on MBP and asialoglycoprotein receptors have
identified some residues important for Gal and GalNAc
binding. In the model, residue H270, conserved in ML and
RHL and critical for GalNAc binding, is within van der
Waals contact distance of the N-acetyl group. Residue H270
is predicted to correspond spatially to a glutamic acid in the
selectins, which is important for carbohydrate binding.
Rat hepatic lectin has a greater preference for binding Gal-
NAc than ML, which is mainly a consequence of four resi-
due changes (ML/RHL: V222/N, A250/R, K252/G, S273/
T). The model suggested that only two of these mutations,
A250/R and V222/N, affect ligand binding directly. How-
ever, two other residues, Y286 and R284, both conserved in
ML and RHL, are within contact distance of each other and
the N-acetyl group (Color Plate 4). Asparagine instead of
valine at position 222 brings the asparagine side chain to
within contact distance of both the N-acetyl group of Gal-
NAc and the side chains of Y286 and H270. This network
of interactions provides an explanation for the observed
differences in GalNAc binding. The side chain of a serine at
position 222, as seen in gbMBP, which does not show RHL-
like selectivity for GalNAc, is too short to form compa-
rable interactions.

SUMMARY

The ML molecular model was constructed in the presence
of low sequence identity with other C-type lectins. This was
possible by combining sequence and structure comparison.
Inverse folding analysis suggested that the accuracy of the
model was sufficient to analyze some of its details. The
predicted ML binding site was inspected, and residues im-
portant for carbohydrate binding were mapped. The major-
ity of residues in the binding site are part of conserved
regions of the C-type lectin fold. Thus, residues at these
positions may be of more general importance to modulate
carbohydrate specificities of members of the C-type lectin
superfamily.

Color Plate 1. Schematic stereo representation (16a) of the MBP C-type lectin fold. Secondary structure elements are defined
according to Kabsch and Sander (16b). Helices are depicted as red cylinders (a1, a2) and β strands (b1-b5) as green arrows
and labeled according to Figure 1. Loops and regions of extended unusual secondary structure are colored gold. The position
of the functional calcium in MBP is shown as a magenta sphere.




Color Plate 2. Comparison of C-type lectin domains. The backbone segments of structurally conserved regions according to Figure 1
in the MBP and E-selectin X-ray structures and the ML model were superimposed. MBP, E-selectin, and ML are shown as solid ribbons
in blue, silver, and gold, respectively. The calcium-binding site shared by the three lectins is shown as a magenta sphere. Compared to
Color Plate 1, the orientation is obtained by approximately 90 degree rotation around the vertical axis (so that helix a2 is in the back).



J. Mol. Graphics, 1996, Vol. 14, October 283





Color Plate 3. The predicted binding site region in the ML model. The orientation of the model is similar to that in Color Plate
1 and is focused on the functional calcium (magenta). The side chains of ML residues that participate in the calcium
coordination sphere are colored pink and residues surrounding this site are shown in green. W257 is the green residue at the

Color Plate 4. Close-up view of the carbohydrate-binding site in ML. The orientation is the same as in Color Plate 3. Residues
in the binding site region are shown and labeled. Also shown is the modeled position of the GalNAc ligand (red).